**Tutorial 09:** WRES1201 – Computer System Architecture

1. Give the characteristics that distinguish RISC organization.

**- a limited and simple instruction set,**

**- a large number of registers or the use of a compiler that optimizes register usage, and**

**- an emphasis on optimizing the instruction pipeline**.

1. Briefly describe the two basic approaches used to minimize register-memory operations on RISC machines.

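**Software: rely on the compiler to maximize register usage, keeping the most frequently used variables in registers.**

**Hardware: provide more registers, so that more variables can stay in registers for longer.**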

1. List and briefly define three types of computer system organization.

Single Instruction Single Data (SISD):

* A single processor executes a single instruction stream to operate on data stored in a single memory.

Single Instruction Multiple Data (SIMD):

* A single machine instruction controls the simultaneous execution of a number of processing elements on a lockstep basis.
* Each processing element has an associated data memory, so that instructions are executed on different sets of data by different processors.

Multiple Instruction Multiple Data (MIMD):

* A set of processors simultaneously executes different instruction sequences on different data sets. SMPs, clusters, and NUMA systems fit into this category.

1. What are the main characteristics of an SMP?

* Two or more similar processors of comparable capability.
* The processors share the same main memory and I/O facilities.
* All processors share access to I/O devices.
* All processors can perform the same functions.
* The system is controlled by an integrated operating system.

1. What is the difference between software and hardware cache coherence schemes?

Software: coherence is handled by the compiler or the operating system.

* The compiler analyzes the code to determine which data items may become unsafe for caching and marks those items accordingly.
* The compiler analyzes the code to determine safe periods for shared variables.

Hardware: coherence is implemented in logic circuits (cache coherence protocols).

* Directory Protocols
* Snoopy Protocols

1. What is the meaning of each of the four states in the MESI protocol?

**Modified:** The line in the cache has been modified (different from main memory) and is available only in this cache.

**Exclusive:** The line in the cache is the same as that in main memory and is not present in any other cache.

**Shared:** The line in the cache is the same as that in main memory and may be present in another cache.

**Invalid:** The line in the cache does not contain valid data.
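
The four states can be tied together with a small state-transition sketch. The Python below is a simplified model of a single cache line as seen by one cache. The names `State` and `next_state` and the event strings are hypothetical labels chosen for illustration, and write-backs and bus signalling are deliberately left out.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"   # differs from main memory, only in this cache
    EXCLUSIVE = "E"  # same as main memory, in no other cache
    SHARED = "S"     # same as main memory, may be in another cache
    INVALID = "I"    # line holds no valid data


def next_state(state, event, others_have_copy=False):
    """Return the next MESI state of one cache line (simplified model).

    event is one of the hypothetical labels:
      "local_read"  - this processor reads the line
      "local_write" - this processor writes the line
      "snoop_read"  - another cache is observed reading the line
      "snoop_write" - another cache is observed writing the line
    """
    if event == "local_read":
        if state is State.INVALID:
            # Read miss: the line is loaded as Shared if another cache
            # already holds it, otherwise as Exclusive.
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state  # read hit: state is unchanged
    if event == "local_write":
        # After a local write this cache holds the only up-to-date copy.
        return State.MODIFIED
    if event == "snoop_read":
        # A valid line becomes Shared; an invalid line stays Invalid.
        return State.INVALID if state is State.INVALID else State.SHARED
    if event == "snoop_write":
        # Another cache is taking ownership, so this copy is invalidated.
        return State.INVALID
    raise ValueError(f"unknown event: {event}")


if __name__ == "__main__":
    s = State.INVALID
    for ev in ["local_read", "local_write", "snoop_read", "snoop_write"]:
        s = next_state(s, ev)
        print(ev, "->", s.name)
```

Starting from Invalid with no other copies, the loop walks the line through Exclusive, Modified, Shared, and back to Invalid, which touches each state once.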

1. What are some of the benefits of clustering?

**Absolute scalability**: Possible to create large clusters that far surpass the power of even the largest standalone machines.

**Incremental scalability**: Possible to add new systems to the cluster in small increments without having to go through a major upgrade.

**High availability**: Because each node is a standalone computer, the failure of one node does not mean loss of service.

**Superior price/performance**: Possible to have a cluster with equal or greater computing power than a single large machine at much lower cost.

**Homework**

1. Let α be the fraction of program code that can be executed simultaneously by *n* processors in a computer system. Assume that the remaining code must be executed sequentially by a single processor. Each processor has an execution rate of *x* MIPS.
   1. Derive an expression for the effective MIPS rate when using the system for exclusive execution of this program, in terms of α, *n* and *x*.
   2. If *n* = 16 and *x* = 6 MIPS, determine the value of α that will yield a system performance of 54 MIPS.

Part 1: MIPS rate = [*n*α + (1 – α)] *x* = (*n*α – α + 1) *x*

Part 2: 54 = (16α – α + 1) × 6, so 15α + 1 = 9 and α = 8/15 ≈ 0.533

2. A program consists of five tasks, which have execution times of 2000, 4000, 6000, 8000 and 10,000 cycles. It is not possible to divide the execution of one task among multiple processors, but there are no communication or synchronization costs. If the tasks are distributed across the processors to achieve the shortest execution time, what is the speedup for executing the program on four processors?

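**Execution time on a single processor = 2000 + 4000 + 6000 + 8000 + 10,000 = 30,000 cycles**

**Execution time on four processors = 10,000 cycles (one processor runs the 10,000-cycle task, while the other three run 8000, 6000, and 4000 + 2000 = 6000 cycles)**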

**Speedup = 30,000/10,000 = 3.0**
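
The task assignment can be reproduced with a short sketch. It uses the longest-task-first greedy heuristic, which is an assumption of this example and not a method stated in the tutorial. The heuristic is not optimal in general, but here it reaches 10,000 cycles, which no schedule can beat because the longest task cannot be split. The function name `lpt_makespan` is hypothetical.

```python
import heapq


def lpt_makespan(task_cycles, processors):
    """Greedy schedule: give each task, longest first, to the least-loaded processor.

    Returns the finishing time (in cycles) of the busiest processor.
    """
    loads = [0] * processors  # a list of equal values is already a valid heap
    for cycles in sorted(task_cycles, reverse=True):
        lightest = heapq.heappop(loads)            # least-loaded processor
        heapq.heappush(loads, lightest + cycles)   # assign the task to it
    return max(loads)


if __name__ == "__main__":
    tasks = [2000, 4000, 6000, 8000, 10000]
    serial = sum(tasks)                # 30000 cycles on one processor
    parallel = lpt_makespan(tasks, 4)  # 10000 cycles on four processors
    print(serial, parallel, serial / parallel)  # 30000 10000 3.0
```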